A Segmentation-Free Recognition Technique to Assist Old Greek Handwritten Manuscript OCR

Authors

  • Basilios Gatos
  • Kostas Ntzios
  • Ioannis Pratikakis
  • Sergios Petridis
  • Thomas Konidaris
  • Stavros J. Perantonis
Abstract

Recognition of old Greek manuscripts is essential for quick and efficient content exploitation of the valuable old Greek historical collections. In this paper, we focus on the problem of recognizing early Christian Greek manuscripts written in lower case letters. Based on the existence of hole regions in the majority of characters and character ligatures in these scripts, we propose a novel, segmentation-free, fast and efficient technique that assists the recognition procedure by tracing and recognizing the most frequently appearing characters or character ligatures. First, we detect hole regions that exist in the character body. Then, the protrusions in the outer contour outline of the connected components that contain the character hole regions are used to classify the area around each hole as a specific character or character ligature. The proposed method gives highly accurate results and offers great assistance to old Greek handwritten manuscript OCR.
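
The abstract does not say how hole regions are detected, so the following is only a minimal sketch of the first step under stated assumptions: the page is already binarized (1 = ink, 0 = background), and a hole is taken to be a 4-connected background region that never reaches the image border. The function name `find_holes` and the sample `GLYPH` are hypothetical and not taken from the paper; the contour-protrusion classification step is not covered.

```python
from collections import deque


def find_holes(image):
    """Return enclosed background regions ("holes") of a binary image.

    `image` is a list of equal-length rows with 1 = ink and 0 = background.
    Each hole is returned as a set of (row, col) pixels.
    Assumption: a hole is a 4-connected background region that does not
    touch the image border.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(start_r, start_c):
        # Breadth-first flood fill over background pixels.
        region = set()
        touches_border = False
        queue = deque([(start_r, start_c)])
        seen[start_r][start_c] = True
        while queue:
            r, c = queue.popleft()
            region.add((r, c))
            if r in (0, rows - 1) or c in (0, cols - 1):
                touches_border = True
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and image[nr][nc] == 0):
                    seen[nr][nc] = True
                    queue.append((nr, nc))
        return region, touches_border

    holes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 and not seen[r][c]:
                region, touches_border = flood(r, c)
                if not touches_border:
                    holes.append(region)
    return holes


# Hypothetical "o"-like glyph: a ring of ink around one background pixel.
GLYPH = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
```

Calling `find_holes(GLYPH)` yields a single hole containing the pixel (2, 2). Because the search starts from background regions rather than from segmented characters, it is in the spirit of a segmentation-free approach, though the actual method may differ.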

Related resources

The Hazards in Segmentation of Handwritten Hindi Text

Optical Character Recognition (OCR) is the process of recognizing handwritten or printed scanned text with the help of a computer. Segmentation is a very important stage of any text recognition system. Problems in segmentation can lower the segmentation rate and hence the recognition rate. A good segmentation technique can improve the recognition rate. This paper deals with the hazards ...

Segmentation Based Optical Character Recognition for Handwritten Marathi characters

Valuable ancient documents such as historical books and old scripts are available in specific regional languages. Problems occur when those documents have to be preserved in digital form or modified. Optical Character Recognition is used to convert the scanned document to Word, Notepad, or any other format so that the document can be edited easily. A complete OCR system for handwritten Devanaga...

Research of Chinese Handwritten Text Segmentation Algorithm

OCR is a complicated process, and many factors can influence the recognition rate. Early on, researchers tried to optimize the classifier to obtain a high recognition rate, on the premise that there is only one character, whether printed or handwritten. Since classifier performance has improved considerably, the recognition rate for a single character is high enough for commercial use. W...

Segmentation of Handwritten Gurmukhi Text into Lines

Text line segmentation is an essential pre-processing stage for handwriting recognition in many Optical Character Recognition (OCR) systems. It is an important step because inaccurately segmented text lines will cause errors in the recognition stage. Text line segmentation of the handwritten documents is still one of the most complicated problems in developing a reliable OCR. The nature of hand...

K-Algorithm A Modified Technique for Noise Removal in Handwritten Documents

OCR has been an active research area for the last few decades. OCR recognizes the text in a scanned document image and converts it into editable form. The OCR process can have several stages, such as preprocessing, segmentation, recognition, and post-processing. The preprocessing stage, which mainly deals with noise removal, is crucial for the success of OCR. In the present p...

Publication date: 2004